Cooperation, structure and hierarchy in multiadaptive games

Game-theoretical models in which the rules of the game and the interaction structure both coevolve with the
game dynamics (multiadaptive games) capture very flexible situations where cooperation among selfish agents
can emerge. In this work, we discuss a multiadaptive model presented in a recent Letter [Phys. Rev. Lett.
106, 028702 (2011)], and generalizations of it. The model captures a non-equilibrium situation where social
unrest increases the incentive to cooperate and, simultaneously, agents are partly free to influence with whom
they interact. First, we investigate the details of how the feedback from the behavior of agents determines
the emergence of cooperation and hierarchical contact structures. We also study the stability of the system to
different types of noise, and find that different regions of parameter space show very different responses. Some
types of noise can destroy an all-cooperator (all-C) state. If, on the other hand, hubs are stable, then so is
the all-C state. Finally, we investigate the dependence on the ratio between the timescales of strategy updates
and the evolution of the interaction structure. We find that comparatively fast strategy dynamics is a
prerequisite for the emergence of cooperation.



PACS numbers: 02.50.Le,89.75.Hc,89.75.Fb,87.23.Ge 

I. INTRODUCTION 

An open question in both biology and the social sciences is how cooperation and the network of
social interactions co-emerge in a population of selfish individuals. Game theory is the basic
theoretical framework to investigate such phenomena. Furthermore, game theory is a language for
describing systems in biology, economy and society where the success of an agent depends both on
its own behavior and the behaviors of others. Through the development and the study of different
types of models, researchers have captured different game-theoretic scenarios. Our previous
work [1] showed a new direction, relaxing constraints of other models with feedback at different
levels to the behavior of the agents. This class of systems can be anticipated to show rich
behavior and be interesting for interdisciplinary studies of social systems. In this paper, we
study the multiadaptive game of Ref. [1] in greater detail and, primarily, extend and simplify it
in various ways to paint a fuller picture of the class of models that the model in Ref. [1]
belongs to.

We motivate our multiadaptive model as an extension of spatial social dilemmas: systems driven by
a conflict between collective and individual interests, where interactions happen between agents
that are close in space. More specifically, we start from the Nowak-May (NM) game [8], which is
technically on the border between the Prisoner's dilemma and Chicken games. It captures social
situations where at any time it is most rewarding to defect. However, in some situations, in a
long-term perspective, agents benefit from establishing mutual cooperation. More mathematically,
each agent can take two actions: defect (D) or cooperate (C). Cooperation means, in this context
at least, that the agents do what is best for the community. An encounter in the NM game gives
zero payoff to anyone interacting with a defector (D), payoff one to a cooperator (C) meeting
another cooperator, and $b > 1$ to a D meeting a C. In the literature, people have used this model
to explain the emergence of cooperation among selfish agents in a vast number of disciplines,
including political science, economics, and biology [2].
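
The payoff rule of the NM game described above can be sketched in a few lines. This is only a minimal illustration, not code from the original study; the function name `nm_payoff` and the string labels "C" and "D" are illustrative choices.

```python
def nm_payoff(own: str, other: str, b: float) -> float:
    """Payoff to a player using strategy `own` against `other` in the NM game.

    Strategies are "C" (cooperate) or "D" (defect); b > 1 is the temptation.
    (Names and labels are illustrative assumptions.)
    """
    if other == "D":
        return 0.0  # anyone interacting with a defector gets nothing
    if own == "C":
        return 1.0  # cooperator meeting another cooperator
    return b        # defector meeting a cooperator
```

With $b > 1$, defecting against a cooperator always pays more than cooperating with it, which is the source of the dilemma.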

In the original NM game, the game rules, as parameterized by the payoff matrix, are fixed in time.
In real-world systems, there could well be a feedback mechanism from the overall success of the
agents, i.e. the society, to the payoff matrix. Imagine for instance that there is a stable,
widespread cooperation that builds up a common wealth among the agents. In such a situation, there
would be more common resources at stake at every interaction, and thus a larger temptation to
defect. The easiest way to incorporate feedback from the entire system to the game rules is to let
the entries of the matrices be variables that depend on the external environment (the society in
socioeconomic game theory, the environment for evolutionary models). This is what we will do and,
following Ref. [3], we define

$$b(t + 1) = b(t) + \alpha\,[\rho(t) - \rho^*], \tag{1}$$

where $t$ is the discrete simulation time, $\rho$ is the fraction of cooperators,
$\rho^* \in [0, 1]$ is a model parameter signifying a neutral cooperation level, and $\alpha > 0$
sets the strength of the feedback from the environment to the payoff. The idea behind this form is
that a high cooperation level means the society gets rich, which should increase the incentive to
exploit this richness, and thus increase $b$. This is not supposed to be regarded as a universal
mechanism; rather a scenario that could apply to a restricted set of social or environmental
situations. The linear response form is motivated by simplicity; one could also imagine a
threshold response (in analogy to other models of response to social influence [4]).
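
As a minimal sketch of the linear feedback rule of Eq. (1), one step of the temptation update could look as follows. The function and argument names are illustrative assumptions; the default values $\rho^* = 1/2$ and $\alpha = 4$ are merely the ones used most often in the text.

```python
def update_temptation(b: float, rho: float,
                      rho_star: float = 0.5, alpha: float = 4.0) -> float:
    """One step of Eq. (1): b(t+1) = b(t) + alpha * (rho(t) - rho_star).

    rho is the current fraction of cooperators; names are illustrative.
    """
    return b + alpha * (rho - rho_star)
```

A cooperation level above the neutral level raises the temptation, a level below it lowers the temptation, and at $\rho = \rho^*$ the temptation stays put.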

The other feature we leave flexible and adaptive is the interaction structure (cf. Refs. [5-7]),
i.e. the social network between the agents. In the spirit of the "strength of weak ties" idea [2],
we assume the environment of people can be differentiated into strong local ties to family and
work colleagues, which are hard to break and of little use when it comes to changing one's social
situation, and weak ties that help to reach information, or to build new social ties, further away
in the social network (the motivating example was how people find new jobs). In our model, we will
also have local ties that do not change. To keep the similarity to the NM game, we let them be the
four neighbors of a square grid. In addition to the local ties, each agent has one connection that
could reach anyone outside of the neighborhood. The agents, we assume, use this "weak tie" to
optimize their position in the social network by connecting it to their best-performing neighbor
(including neighbors of the weak, long-range link, so that this link can wander off, away from the
local surrounding of strongly connected links). This setup, inspired by Ref. [6], will be
described more algorithmically below.

The rest of the paper is organized as follows. In Sec. II we define what we call the adaptive
model, essentially the scenario above without the social-network dynamics. In this section, we
also analyze this model numerically. In Sec. III we present and investigate the full,
multiadaptive model, including the emergence of cooperation and social structure. In Sec. IV we
study the response of the system to noise and, particularly, the stability of the all-C state. We
investigate the role of timescale differences between updating strategies and rewiring non-local
links in Sec. V. Finally, in Sec. VI we discuss our results and open problems.



II. ADAPTIVE MODEL

In this section, we will successively move from the NM game toward the multiadaptive game
mentioned above. First, we will include feedback from the environment (the overall wealth of the
agents) to the rules of the game as parameterized by the payoff matrix. Later we will investigate
the case where an agent has a long-range link (which, in this section, is not open to
optimization).

The basic set-up is an $L \times L$ square grid of agents interacting with their four nearest
neighbors. We let this square grid have fixed boundary conditions, so that an agent in the
interior interacts with four others, an agent on the edges interacts with three others and an
agent in the corners interacts with two others. Unless otherwise stated, we will use $L = 100$. As
mentioned above, the global state of the system is determined by the temptation to defect
$b(t, \rho)$ given by Eq. (1). The initial value $b_0$ of the temptation is another parameter of
the model.
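
The fixed (non-periodic) boundary conditions described above can be made concrete with a short helper. This is a sketch only; the row-major indexing of the agents and the name `grid_neighbors` are assumptions made for illustration.

```python
def grid_neighbors(i: int, L: int) -> list[int]:
    """Nearest neighbors of site i on an L x L grid with fixed boundaries.

    Sites are numbered row by row (an illustrative convention). Interior
    sites get four neighbors, edge sites three and corner sites two.
    """
    row, col = divmod(i, L)
    neighbors = []
    if row > 0:
        neighbors.append(i - L)  # site above
    if row < L - 1:
        neighbors.append(i + L)  # site below
    if col > 0:
        neighbors.append(i - 1)  # site to the left
    if col < L - 1:
        neighbors.append(i + 1)  # site to the right
    return neighbors
```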

FIG. 1: (Color online) Parameter dependence of the game as seen in the temptation and average
cooperator density on a square lattice (left panels). Right panels show the results when adding
non-local links. Panels (a) and (b) show the average density of cooperators $\rho$ as a function
of the initial temptation $b_0$, with $\alpha = 0.1$, 2, and 4. The bars represent points averaged
over the last 1000 of 5000 [(c) and (e)] and 500 of 1000 [(d) and (f)] steps. Panels (c) and (d),
and (e) and (f), correspond to the time evolution of $\rho$ and $b$, respectively, for different
values of $b_0$ when $\alpha = 4$. Panels (g) and (h) show the diagram over the three regions in
$\alpha$-$b_0$ space. The curves are averages over $10^4$ runs.

Starting from a random configuration of defectors and cooperators in the population, we update the
system first by calculating the total payoff of an agent $i$ as the sum of the payoffs obtained in
the last, synchronous, interaction,



$$u_i = \sum_j A_{ij} u_{ij}, \tag{2}$$



where $u_{ij}$ is $i$'s payoff from the NM game in the interaction with $j$, and $A_{ij}$ is the
adjacency matrix, whose value is 1 (or 0) depending on whether (or not) $i$ and $j$ are neighbors
in the social network. When the payoff is gleaned, the agents, once again synchronously, choose
strategies (whether to cooperate or defect). If an agent $i$ has a higher total payoff than its
neighbors, nothing happens. But if an agent $j$ with a link to $i$ has a higher total payoff, then
$i$ uses the same strategy in the next timestep as $j$ just did, with a transition probability
given by [6],



$$\Pi(i \to j) = \frac{1}{1 + \exp[-\beta (u_j - u_i)]}. \tag{3}$$



Here $\beta$ parameterizes the noise in the choice of whom to imitate. This type of selection
noise is further discussed in Ref. [10]. Except in Sec. IV, we use $\beta = 1$.
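
The payoff sum of Eq. (2) and the imitation rule of Eq. (3) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: Eq. (3) is read as the Fermi function $1/\{1 + \exp[-\beta(u_j - u_i)]\}$, the neighbor lists play the role of the adjacency matrix $A_{ij}$, and all function and variable names are hypothetical.

```python
import math


def total_payoff(i, strategies, neighbors, b):
    """Eq. (2): sum of agent i's NM payoffs over all its neighbors.

    strategies[j] is "C" or "D"; neighbors[i] lists the agents linked to i
    (an adjacency-list stand-in for A_ij). Names are illustrative.
    """
    total = 0.0
    for j in neighbors[i]:
        if strategies[j] == "C":  # meeting a defector yields zero
            total += 1.0 if strategies[i] == "C" else b
    return total


def imitation_probability(u_i, u_j, beta=1.0):
    """Probability that i copies j, assuming the Fermi form for Eq. (3).

    Hard selection criterion of the text: a neighbor that does not perform
    strictly better than i is never imitated.
    """
    if u_j <= u_i:
        return 0.0
    return 1.0 / (1.0 + math.exp(-beta * (u_j - u_i)))
```

For instance, a defector with two cooperating neighbors and $b = 1.5$ collects a total payoff of 3, and a cooperator whose payoff is lower by 2 would copy it with probability $1/(1 + e^{-2})$ when $\beta = 1$.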

In Fig. 1 we give numerical results summarizing the behavior of the model. In panel (a), we
display the average cooperator density, $\rho$, as a function of $b_0$ for three values of
$\alpha$. If $\alpha = 4$ and $b_0 < 4.0$, we can see the system converging to a state with
$\rho \approx \rho^* = 1/2$. This behavior holds through a region of the parameter space that we
denote I. For large $b_0$-values, there is a transfer to another behavior, region III,
characterized by a vanishing cooperation. We call the value where this happens
$b_{\mathrm{II^*,III}}$. We note that $b_{\mathrm{II^*,III}}$ increases with $\alpha$. This means
that if the coupling between the overall behavior and the payoff matrix gets stronger, defection
needs higher initial temptation values to take over the population. Between regions I and III
there is a region II* where, depending on $b_0$, the cooperation density converges either to
$\rho^*$ or to 0, with probabilities depending on $\alpha$ and $b_0$. (We save the notation II for
another behavior that will be discussed below.) With increasing $b_0$, the probability that the
system ends in $\rho^*$ decreases, and vanishes completely at $b_{\mathrm{II^*,III}}$. In
Figs. 1(c) and (e), we display trajectories of $\rho$ and $b$, averaged over $10^4$ runs, for
$b_0 = 2.5$, 5.5, and 9.0 with $\alpha = 4$. These curves show that the system stabilizes to a
steady cooperation level after an oscillatory transient. In terms of configurations, such
oscillations are manifested as growing and shrinking C or D clusters. This can be explored further
with our Java applet of the model [11]. For all parameter values we study, the oscillatory
behavior will be dampened to a fixed point at (or, at least, very close to) $\rho^*$. An
interesting observation is that in region I the temptation can be controlled by the feedback so
that the final density of cooperators $\rho^*$ is at some intermediate value between 1 and 0 (that
is, both C and D clusters coexist in the fixed point). Dynamically, region III is characterized by
the system hitting the fixed point $\rho = 0$ faster than the environment can respond by tuning
the value of $b$. In Fig. 1(g), we plot the boundaries between the regions in the $\alpha$-$b_0$
plane. We identify region I numerically as when $\rho$, at convergence, deviates by less than 0.5%
from $\rho^*$, in other words when $|\rho - \rho^*|$ is less than half a percent. Region III is
identified as when $\rho < 0.005$. From Fig. 1(g) we see that the boundary value
$b_{\mathrm{I,II^*}}$, separating region I from II*, increases with an increasing $\alpha$; the
coexistence region for C and D becomes wider with increasing strength of the feedback. In
region I, for all measured values of $\alpha$, the system relaxes to a steady state with
$\rho \approx \rho^*$ and $b$ converges to an intermediate value. For example, $b_0 = 2.5$ gives
$b(t \to \infty) \approx 1.2$ [Figs. 1(c) and (e)]. This happens when the feedback in Eq. (1) is
strong enough to balance $b$. When $b_0$ increases beyond $b_{\mathrm{I,II^*}}$, the feedback from
the environment starts affecting $b$ so much that the system hits the all-D state. As a final note
on Fig. 1(g), we see that $b_{\mathrm{II^*,III}}$, separating region II* from III, increases
monotonically with $\alpha$.
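
The numerical criteria used to identify the outcome of a run can be sketched as a small classifier. Several things here are assumptions: the 0.5% criterion is read as an absolute deviation $|\rho - \rho^*| < 0.005$, the all-C threshold is chosen by symmetry with the stated all-D threshold, and the function name and labels are illustrative.

```python
def classify_outcome(rho: float, rho_star: float = 0.5,
                     tol: float = 0.005) -> str:
    """Label the converged cooperator density rho of a single run.

    "intermediate": |rho - rho_star| < tol (criterion for region I,
                    read here as an absolute deviation; an assumption)
    "all-D":        rho < tol (criterion stated for region III)
    "all-C":        rho > 1 - tol (assumed by symmetry, not in the text)
    """
    if abs(rho - rho_star) < tol:
        return "intermediate"
    if rho < tol:
        return "all-D"
    if rho > 1.0 - tol:
        return "all-C"
    return "unclassified"
```

Regions with several possible end states, such as II*, would then show up as a mixture of labels over repeated runs at the same $\alpha$ and $b_0$.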

Next, we continue moving closer to the multiadaptive game by introducing long-range links to every
agent. At this stage, they are distributed randomly and not open to optimization. When the
strength of feedback is weak enough ($\alpha = 0.1$), $\rho$ shows a behavior similar to the case
without non-local links [see Fig. 1(b)]. However, the average density of cooperators $\rho$ as a
function of $b_0$ changes drastically when $\alpha$ is larger ($\alpha = 2$ and 4 in Fig. 1). For
example, if $\alpha = 2$ then $b_{\mathrm{I,II}} \approx 1.4$, and if $\alpha = 4$ then
$b_{\mathrm{I,II}} \approx 1.2$. That is to say that region I, where the system converges to a
state with $\rho \approx \rho^* = 1/2$, shrinks with increasing $\alpha$. Strikingly, a new
absorbing state $\rho = 1$ (an all-C state) appears in region II. This region thus has three
possible steady states, $\rho = 0$, $\rho^*$ and 1, and which one the system ends up in is a
probabilistic event, with probabilities of the various outcomes that depend on $\alpha$ and
$b_0$. In particular, there is a subregion in II where the system goes almost surely to the all-C
state, for example for $3 < b_0 < 4$ and $\alpha = 4$. This is different from region II* (in the
case of no non-local links) above. Since non-local links shrink the distance scaling in the
network (from $N^{1/2}$ to $\log N$ or even shorter) [12], C and D clusters have a larger
interface, which apparently is to the C clusters' advantage. With increasing $b_0$, the
probability that the system ends in all-C decreases, and vanishes completely at
$b_{\mathrm{II,III}}$. In Figs. 1(d) and (f) we display typical trajectories of $\rho$ and $b$ for
$b_0 = 1.1$, 3.5 and 8.0 with $\alpha = 4$. These curves indicate that the system stabilizes to a
steady cooperation level after about 10 timesteps. By the adaptive payoff dynamics, we can explain
the transient oscillations. Since each node is connected to a random partner by a non-local link,
it has a chance of connecting to any other node, and it is not too far-fetched to assume a
well-mixed approximation. If we, furthermore, assume that the strategy adoption rate is
proportional to the relative success of the strategies, then one can approximate the dynamics by
the following replicator equation system



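$$\frac{d\rho}{dt} = \begin{cases} \rho^2 (1 - \rho)(1 - b) & \text{if } \rho \in [0, 1], \\ 0 & \text{otherwise}, \end{cases} \tag{4a}$$

$$\frac{db}{dt} = \alpha (\rho - \rho^*). \tag{4b}$$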
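
To get a feeling for the replicator system, one can integrate it numerically. The sketch below uses a plain forward-Euler scheme and reads Eq. (4a) as $d\rho/dt = \rho^2 (1 - \rho)(1 - b)$ on $[0, 1]$ (zero otherwise) and Eq. (4b) as $db/dt = \alpha(\rho - \rho^*)$. The step size, the clipping of $\rho$ to $[0, 1]$ and all names are illustrative assumptions, not part of the original analysis.

```python
def integrate_replicator(rho0, b0, alpha=4.0, rho_star=0.5,
                         dt=0.01, steps=5000):
    """Forward-Euler integration of the replicator system, Eqs. (4a)-(4b).

    Returns a list of (t, rho, b) tuples. rho is clipped to [0, 1] after
    each step so that discretization errors cannot push it out of range.
    """
    rho, b = rho0, b0
    trajectory = [(0.0, rho, b)]
    for n in range(1, steps + 1):
        if 0.0 <= rho <= 1.0:
            drho = rho * rho * (1.0 - rho) * (1.0 - b)  # Eq. (4a)
        else:
            drho = 0.0
        db = alpha * (rho - rho_star)                   # Eq. (4b)
        rho = min(1.0, max(0.0, rho + dt * drho))
        b = b + dt * db
        trajectory.append((n * dt, rho, b))
    return trajectory
```

Starting exactly at $\rho = 1$ or $\rho = 0$ leaves $\rho$ unchanged while $b$ drifts upward or downward, respectively, which mirrors the absorbing all-C and all-D states of the simulations.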



From the factors $\rho^2$ and $1 - \rho$ in Eq. (4a), we see the fixed points 0 and 1 of $\rho$.
The $\rho = \rho^*$ fixed point, however, cannot be explained by these equations. From Eqs. (4),
we can understand the oscillatory behavior at least qualitatively. If we have $b > 1$ and
$\rho > \rho^*$, then $b$ will increase and $\rho$ decrease. After some time, this will make
$\rho$ lower than $\rho^*$, i.e. $db/dt$ becomes negative. From this situation, with a decreasing
$b$ and $\rho$, we will eventually reach a situation where $b$ is less than 1. If this happens
before $\rho$ hits zero, but after it falls below $\rho^*$, then $\rho$ will start growing again
and, once it exceeds $\rho^*$, so will $b$. Overall, this describes a cyclic phenomenon, which
then, in practice, dampens out to an intermediate, non-trivial fixed point, or hits the all-C or
all-D fixed point. We never see any sustained oscillations. Our Java applet of the adaptive game
with non-local links gives a good illustration of how these oscillations look at a configuration
level [11]. Perhaps the most interesting observation in this simulation is that there is an
all-cooperator state for some parameter values. As an illustration, consider the $b_0 = 3.5$ curve
in Fig. 1(d). At $t = 1$, $\rho$ decreases to almost 0. This can be explained by the strong
initial temptation to defect. A few cooperators survive and seed an increasing cooperator cluster
(cf. the replicator dynamics above). When $\rho$ becomes larger than $\rho^*$, $b$ starts
increasing again, but here $b$ is still too small to stop the system from getting absorbed by the
all-C state. For large $b_0$ ($> 3.5$), we see that $\rho$ approaches its final value
monotonically. For smaller values, however, we observe the above-explained oscillations. In the
interval $b_0 > b_{\mathrm{I,II}}$, we see that the system oscillates too wildly for the system to
respond. This has the consequence that it hits one of the fixed points. Figure 1(h) shows a
diagram over the different dynamic regions of $\alpha$-$b_0$ parameter space for this case with
long-range links. The inclusion of the long-range links obviously changes the region diagram quite
considerably. We note that the boundary value $b_{\mathrm{I,II}}$, separating region I from II,
decreases with an increasing $\alpha$ ($b_{\mathrm{I,II}} \approx 2$ for $\alpha < 0.1$ and
$b_{\mathrm{I,II}} \approx 1$ as $\alpha$ grows towards 5), and $b_{\mathrm{II,III}}$, separating
region II from III, increases monotonically with $\alpha$. In Fig. 1(h), we see that region II
becomes wider with increasing $\alpha$, meaning that in our model, the presence of non-local
interactions enhances cooperation.

FIG. 2: (Color online) The average density of cooperators $\rho$ of the multiadaptive model on a
randomly but locally connected lattice. Local links are removed with a probability $p = 0.1$ (a),
0.3 (b), 0.5 (c), and 0.7 (d). We use parameters $\alpha = 0.1$, 2, 4 in all panels.



agents $i$ have collected their payoffs, they look through their neighborhoods and, if another
agent $j$ has a higher payoff, $i$ adopts $j$'s strategy, C or D, with a probability given by
Eq. (3), and, if it updates the strategy, it simultaneously rewires its non-local link to the
non-local neighbor of $j$. We require the graph to be simple, so we do not allow self-links and
multiple links.
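
The combined strategy-adoption and rewiring step can be sketched as below. Several details are assumptions rather than statements from the text: each agent is taken to own exactly one outgoing non-local link stored in `weak_link`, and a rewiring that would create a self-link or a duplicate link is simply refused, keeping the old link. All names are hypothetical.

```python
def adopt_and_rewire(i, j, strategies, weak_link, local_neighbors):
    """Agent i imitates agent j: return i's new (strategy, weak-link target).

    i copies j's strategy and tries to move its non-local link to j's
    non-local partner. To keep the graph simple, the move is refused
    (old link kept; an assumption) if it would give a self-link or
    duplicate an existing local or non-local link.
    """
    new_strategy = strategies[j]
    target = weak_link[j]
    creates_self_link = target == i
    duplicates_local = target in local_neighbors[i]
    duplicates_weak = weak_link[target] == i  # target already points at i
    if creates_self_link or duplicates_local or duplicates_weak:
        target = weak_link[i]
    return new_strategy, target
```

In a synchronous update, one would compute these pairs for all imitating agents from the old state first and only then overwrite `strategies` and `weak_link`.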

In Fig. 3(a), we plot the average density of cooperators $\rho$ as a function of $b_0$ for three
values of $\alpha = 0.1$, 2 and 4. One can note that $\rho$ shows qualitatively similar behavior
to the adaptive model with non-local links discussed in Sec. II. To be more specific, also in the
multiadaptive case there are three regions, whose boundary values $b_{\mathrm{I,II}}$
($b_{\mathrm{II,III}}$) decrease as $\alpha$ increases. In this case too, the system reaches the
all-C state for certain values of $b_0$ in region II, and the time evolution of $b$ and $\rho$
shows similar transient oscillation behavior. Nonetheless, there are quantitative differences. For
example, the systems with $b_0 < 2$ still show steady-state behavior
($\rho \approx \rho^* = 1/2$) for $\alpha = 4$. Thus, in region I, for all measured values of
$\alpha$, the system relaxes to a steady state with $\rho \approx \rho^*$ and $b$ converges to a
stable value. For example, if $b_0 = 1.3$ we have $b(t \to \infty) \approx 2.6$. In addition,
cooperation is more strongly promoted by the adaptive networks, as defection is effectively more
strongly inhibited by the feedback from the environment to the payoff matrices.

As a generalization of the multiadaptive model, we consider a probability $p$ that each local
connection of the two-dimensional lattice is disconnected. Thus, each agent has $k_i$ local links
with $0 \le k_i \le 4$ and an adjustable non-local link. If $p = 0$ then it is exactly the same as
the multiadaptive model above. By controlling the probability $p$, we investigate multiadaptive
models on the percolation cluster with non-local connections. As shown in Fig. 2, the density of
cooperators shows qualitatively similar behavior when $p$ is small. With increasing $p$, the all-C
state disappears when $\alpha = 4$ from $p = 0.5$ (the two-dimensional bond percolation
threshold). The boundary value separating region I from II (II from III) increases with an
increasing $p$. Thus, stronger feedback is needed for the system to reach the all-C state as $p$
increases. From these results, we find that the local connections are essential to support
cooperation.
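
The random removal of local links can be sketched as an independent bond dilution of the square grid. The row-major site numbering, the use of a caller-supplied `random.Random` instance and the function name are illustrative assumptions.

```python
import random


def diluted_lattice_edges(L, p, rng):
    """Local links of an L x L grid (fixed boundaries) after dilution.

    Each nearest-neighbor bond is removed independently with probability
    p, i.e. kept when a uniform random number in [0, 1) is >= p.
    Sites are numbered row by row (an illustrative convention).
    """
    edges = []
    for row in range(L):
        for col in range(L):
            i = row * L + col
            if col + 1 < L and rng.random() >= p:
                edges.append((i, i + 1))  # bond to the right neighbor
            if row + 1 < L and rng.random() >= p:
                edges.append((i, i + L))  # bond to the neighbor below
    return edges
```

With $p = 0$ all $2L(L - 1)$ bonds survive and the undiluted model is recovered, for example via `diluted_lattice_edges(100, 0.0, random.Random(1))`.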

In the following, we will investigate in detail how other factors, such as restricting $b$ and
finite system size, influence the evolution of cooperation and the interaction patterns among the
agents.



III. MULTIADAPTIVE MODEL 



A. Multiadaptive model with bounded temptation



In this section, we go one step further by considering not only how interaction determines the
evolution of cooperation, but also how the interaction patterns themselves emerge, arriving at the
full model of Ref. [1]. To this end, we extend our adaptive model by including a mechanism where
the agents are allowed to adjust their non-local links to maximize their payoffs. As mentioned in
the Introduction, we model strong ties by keeping the local interactions fixed [1].

We update the state of the system, both strategies and non-local links, synchronously. At a
timestep, each agent plays the NM game with all its local and non-local neighbors. After all



Unless the system has reached a steady state with a finite fraction ($0 < \rho < 1$) of
cooperators, $b$ grows (if $\rho = 1$) or decreases (if $\rho = 0$) unboundedly. For example, in
the all-C state $b$ will increase forever according to Eq. (4b). In this situation, as the fixed
points in any real system would be metastable rather than permanent, $b$ should not be
overinterpreted. In principle, one can say that reaching the fixed point marks the end of
applicability of the model. Alternatively, one can patch the model by imposing a bound on $b$.

FIG. 3: (Color online) Panel (a) shows the average density of cooperators $\rho$ as a function of
the initial temptation $b_0$, with $\alpha = 0.1$, 2, and 4. Panel (b) shows the effects of a
bound on the temptation values, with $b(t + 1) = B$ for $b(t + 1) > B$ and $b(t + 1) = -B$ for
$b(t + 1) < -B$; otherwise $b(t + 1)$ follows Eq. (1). The parameters used are $\alpha = 4$ and
$\beta = 1$. In panel (c), $\rho$ is plotted as a function of $b_0$ for various sizes $L$ with
$\alpha = 4$ and $\beta = 1$, and panel (d) shows that $b_{\mathrm{II,III}}$ increases
logarithmically with $L$.

We limit the temptation in Eq. (1) by letting $b(t + 1) = B$ if $b(t) > B$ and $b(t + 1) = -B$ if
$b(t) < -B$. The numerical results for $\alpha = 4$ of this modified model are plotted in
Fig. 3(b).
When $B$ is relatively large ($B = 5$), we see that the average cooperator density as a function
of the initial temptation looks qualitatively the same as in Fig. 3(a). That is, there are three
different regions corresponding to the same type of dynamic behavior as in the unbounded case. In
contrast, if $b$ is more restricted ($B < 5$), region III (where the system sticks to the all-D
fixed point) disappears. In addition, in region II the probability that the system hits the all-C
fixed point decreases as $B$ increases. Instead, region I, where $\rho$ converges to an
intermediate value, extends to larger $b_0$-values. This gives some perspective on the original
model of Ref. [1]: if, in the unbounded multiadaptive model, the temptation to defect is too
large, the system gets into the all-D absorbing state easily because of the relatively slow
regulation of $b$ compared to the faster regulation of $\rho$ [according to Eqs. (4a) and (4b)].
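
A bounded version of the temptation update can be sketched as a clamp applied after the linear feedback step. This follows the variant in which the new value $b(t + 1)$ is restricted to $[-B, B]$; the function and argument names are illustrative assumptions.

```python
def update_bounded_temptation(b: float, rho: float, B: float,
                              rho_star: float = 0.5,
                              alpha: float = 4.0) -> float:
    """Eq. (1) followed by a clamp of the new temptation to [-B, B].

    rho is the current fraction of cooperators; names are illustrative.
    """
    b_next = b + alpha * (rho - rho_star)
    return max(-B, min(B, b_next))
```

In the all-C state the temptation then saturates at $B$ instead of growing without bound, and in the all-D state it saturates at $-B$.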



B. Finite-size effects 

Now we turn to a brief investigation of finite-size effects. Figure 3(c) shows the results for
$\rho$ obtained for systems with linear size $L = 16$, 64 and 256. The parameters are
$\alpha = 4$ and $\beta = 1$. As shown in Figs. 3(c) and (d), for larger system sizes, the
threshold separating regions I and II, $b_{\mathrm{I,II}}$, saturates with $L$, while
$b_{\mathrm{II,III}}$ increases logarithmically with $L$. In particular, $b_{\mathrm{I,II}}$
converges to 2, and $b_{\mathrm{II,III}}$ scales as $a \ln L$ with $a = 0.70(4)$. In other words,
the cooperative regions are more stable the larger the system is.



C. Correlation between game and network structure

In this section, we turn to the relation between game dynamics and network topology. To simplify
the analysis, we will only consider the network of non-local links, disregarding the links of the
background square grid. In Fig. 4 we show $\rho_k$, the fraction of cooperators or defectors of a
particular degree $k$ in the well-converged state ($t > 500$). Our three different regions (as
defined by the cooperator dynamics) show different network structure. For region I, represented by
$b_0 = 1.1$ [Fig. 4(a)], if $k > 3$, cooperators have a larger $\rho_k$ than defectors.
Furthermore, all nodes with $k > 41$ are cooperators. Since the final densities of C and D are
equal in such a situation, a high-degree cooperator can protect its neighbors from invasion by
defectors, and thus support cooperation. For region II, e.g. when $b_0 = 3.5$ [Fig. 4(b)], the
system is stuck in the all-C state (i.e. $\rho_k = 0$ for all $k$ for the defectors), and we find
that $\rho_k$ for the cooperators is fairly close to a power law with an exponential cutoff and a
decay exponent of about 2.7. Since the final state, in this case, is all-C, the payoff an agent
can gather will be a linear function of its degree. Hence, during the rewiring process, the
probability of the agents getting new links will roughly be proportional to their current degrees.
In growing networks, "preferential attachment" (that the probability of a node receiving a new
link is proportional to its degree) is known to generate a power-law degree distribution [13]. In
this case, where the networks are not growing, Ref. [1?] shows that preferential attachment needs
to be balanced by an antipreferential deletion of links in order for a power-law degree
distribution to appear. In our networks, the power-law-like degree distribution remains for larger
$b_0$ even though $\rho_k$, in the steady state, varies. For systems in the all-D state, the
rewiring process works differently than when $\rho = 1$. Since the payoff of a defector is degree
independent, its non-local link will be rewired randomly to another D. This explains Fig. 4(c),
where we can see that the generated networks have a narrower, peaked degree distribution.



D. Emergent network structure

Now we will turn to the network structure of the multiadaptive game model, extending the analysis
in Ref. [1]. From Fig. 4 we understand that the coevolution of the contact patterns and the payoff
matrix, in region II, changes the underlying network from its initial random graph with a Poisson
degree distribution to a skewed and fat-tailed degree distribution. In Fig. 5(a), we see that the
cumulative degree distribution (the probability that an observed degree $k$ is larger than $K$)
depends strongly on $b_0$. This is especially true for $b_0 = 3.5$, where the distribution seems
to follow a power-law scaling for over two decades (which is quite much considering the relatively
small $L = 100$ system sizes). In Fig. 5(b), we make a more detailed investigation of the
hierarchical features of the steady-state networks. This hierarchy can be characterized by the
clustering coefficient (the fraction of possible triangles a node is a member of, given its
degree) of a node with degree $k$. If the clustering decays with degree as






FIG. 4: (Color online) Correlations between the strategy and network structure. Circles (squares)
correspond to the average density of cooperators (defectors) with degree $k$, $\rho_k$. Panel (a)
is for $b_0 = 1.1$ (region I), (b) is for $b_0 = 3.5$ (region II), and (c) is for $b_0 = 8$
(region III). In panel (b) the exponent of the power law is $2.7 \pm 0.1$.

FIG. 5: (Color online) Structural properties of the network in the steady state for different
values of $b_0$ when $\alpha = 4$. Panel (a) displays the cumulative degree distribution.
Panel (b) shows the clustering coefficient $C$ as a function of degree $k$. The line marks a
scaling inversely proportional to the degree. We investigate the degree-degree correlations by
plotting the average neighbor degree $k_{nn}$ as a function of degree $k$ (c) and the average
assortativity $r$ as a function of $b_0$ (d). In plots (a)-(c) the initial temptation is 1.1, 3.5
and 8, respectively.



$C(k) \sim k^{-1}$ [16], the network is claimed to have a hierarchical structure wherein the nodes
of highest degree are connected to a level below, which in turn is connected to a level below, and
so on. This scaling quantifies the coexistence of a hierarchical structure of nodes with different
degrees. This is indeed what we observe for large $b_0$-values. To investigate the degree
correlation between nodes at either side of a link, we first measure the average degree of the
nearest neighbors $k_{nn}$ as a function of degree $k$ [17]. If there were no degree correlations,
then $k_{nn}(k)$ would be degree independent. This is not the case for large $b_0$-values, where
the network is disassortatively mixed, i.e. highly connected nodes have a tendency to be connected
to low-degree nodes and vice versa [see Fig. 5(c)]. We also measure the assortativity $r$ [18] as
a function of $b_0$ in Fig. 5(d). The assortativity confirms the conclusion from the cooperation
level plots of Fig. 3: for the complex intermediate region II, $r$ is larger than in the other
regions, meaning that relatively many large-degree nodes are connected to other large-degree
nodes, and low-degree nodes to low-degree



FIG. 6: (Color online) Response to noise in strategy selection: $\rho$ as a function of $b_0$ for
$\alpha = 4$ with varying $\beta$.

FIG. 7: (Color online) Noise effects on pairwise exponential comparison dynamics. We show $\rho$
as a function of $b_0$ for the same parameter values as in Fig. 4(a).



nodes. Metaphorically, one can see the diversity of behaviors in this region as the result of a
power struggle between hubs, where the cooperator hubs win for some $b_0$-values and the defector
hubs win for others.
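
The two degree-correlation measures used above can be computed directly from an edge list. The sketch below is a plain-Python illustration, assuming an undirected simple graph given as a list of node pairs; $r$ is taken to be the Pearson correlation of the degrees at either end of a link, and the function names are hypothetical.

```python
from collections import defaultdict


def degrees(edges):
    """Degree of every node appearing in an undirected edge list."""
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def average_neighbor_degree(edges):
    """k_nn(k): mean neighbor degree, averaged over all nodes of degree k."""
    deg = degrees(edges)
    neighbor_sum = defaultdict(int)
    for u, v in edges:
        neighbor_sum[u] += deg[v]
        neighbor_sum[v] += deg[u]
    per_degree = defaultdict(list)
    for node, k in deg.items():
        per_degree[k].append(neighbor_sum[node] / k)
    return {k: sum(vals) / len(vals) for k, vals in per_degree.items()}


def assortativity(edges):
    """Pearson correlation of the degrees at the two ends of a link."""
    deg = degrees(edges)
    xs, ys = [], []
    for u, v in edges:  # count every link in both directions
        xs += [deg[u], deg[v]]
        ys += [deg[v], deg[u]]
    mean = sum(xs) / len(xs)
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return cov / var if var > 0 else float("nan")  # undefined if all degrees equal
```

A star graph, where one hub connects to otherwise isolated leaves, is the extreme disassortative case: $k_{nn}(k)$ decreases with $k$ and $r = -1$.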



IV. MULTIADAPTIVE MODEL WITH NOISE 

In this section, we test extensions of our multiadaptive 
model to incorporate various types of noise. 

A. Tuning the strategy-selection noise via $\beta$

In our multiadaptive model, $\alpha$ and $b_0$ are the most fundamental parameters regulating the
individual behavior. The parameter $\beta$ serves as a control parameter for the noise (or
uncertainty) in the selection process. More concretely, it can be regarded as the reciprocal of
the noise intensity [10]. The larger the value of $\beta$, the less obvious the noise effect.
Under the update rule of Eq. (3), the strategy of a better performing neighbor is likely to be
adopted, while it is also possible (if unlikely) that the strategy of a worse performing neighbor
is preferred occasionally. In the limit $\beta \to 0$ all information is lost, that is, the agents
are unable to retrieve useful information from the interaction, and just switch strategies as if
by tossing a coin. In order to study the noise effect on the evolution of cooperation, we
calculate how $\rho$ changes when varying $\beta$. Figure 6 shows $\rho$ as a function of $b_0$
for different $\beta$ in the case of $\alpha = 4$. Qualitatively, the curves of $\rho$ versus
$b_0$ are similar, but we find visible quantitative differences among them. In particular, the
part of region II where the system evolves to the all-C state expands to the large-$b_0$ regime as
$\beta$ decreases, and the threshold value $b_{\mathrm{I,II}}$ separating regions I and II also
increases monotonically with decreasing $\beta$.

If one replaces the hard selection criterion that we use
(following the Nowak-May game [8] and much of the subsequent
literature), namely that an agent can only imitate a neighbor that
performs better than itself, by a softer probabilistic rule,
following Eq. (3) also for negative $u_j - u_i$, then the system
will be more strongly affected by the noise. Potentially, such rules
can break cooperative states. In Fig. 7 we update the simulation
by such a random pairwise-comparison dynamics and notice that the
all-C state of Fig. 4(a) is replaced by regions of alternating
higher and lower $\rho$. Still, in some of these regions the
cooperation level is well over 50% and independent of $\alpha$
above some threshold (see Fig. 7). We leave it for future
studies to investigate the origin of the complex $b_0$ dependence
in the intermediate region and whether or not the model can
reach the all-C state with this type of updating dynamics.
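
The difference between the hard criterion and the softer rule can be
made concrete with a minimal Python sketch. It assumes that Eq. (3)
has the standard Fermi form $1/(1 + \exp[-\beta (u_j - u_i)])$, which
is consistent with the coin-toss limit described in this subsection
but is not restated here; the function names are hypothetical.

```python
import math


def fermi(delta_u, beta):
    """Assumed Fermi form of Eq. (3): adoption probability for payoff gap delta_u."""
    return 1.0 / (1.0 + math.exp(-beta * delta_u))


def imitation_probability(u_self, u_neighbor, beta, soft=False):
    """Probability that an agent adopts the neighbor's strategy.

    soft=False: hard criterion, only a strictly better neighbor can be imitated.
    soft=True:  the Fermi rule is applied also for negative payoff gaps.
    """
    delta_u = u_neighbor - u_self
    if not soft and delta_u <= 0:
        return 0.0
    return fermi(delta_u, beta)


if __name__ == "__main__":
    for beta in (0.0, 0.1, 1.0, 10.0):
        better = imitation_probability(1.0, 2.0, beta)
        worse = imitation_probability(2.0, 1.0, beta, soft=True)
        print(f"beta={beta:5.1f}  better neighbor: {better:.3f}  "
              f"worse neighbor (soft rule): {worse:.3f}")
```

Under the soft rule a worse-performing neighbor is imitated with a
nonzero probability for any finite $\beta$, which is the channel
through which noise can erode a cooperative state.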



B. Strategy mutation 

In order to investigate the response of the all-C state to
noise, we proceed by adding stochastic changes to the strategies.
For simplicity, we start from a system consisting of only
cooperators in the steady state, obtained by setting the parameters
$\alpha = 4$, $\beta = 1$, and $b_0 = 3.5$. We try two cases where
we flip the strategy from D to C or C to D once every hundredth
timestep, either at the node of largest degree or at a random node.
We also test a case where every node flips with a small random
chance per timestep. The reason for this mutation rate is that we
want the system to be able to recover to an all-C state (which
takes somewhat less than 100 timesteps [1]).

In the first case, the agent located on the node with the
largest degree changes its strategy to the opposite one at every
time interval $\Delta t$. Here, we set $\Delta t$ to 100, which we
believe is reasonable since the cooperation density of the
multiadaptive model stabilizes after about 50 timesteps of
relaxation. As shown in Fig. 8(a), if a D appears on the largest
hub, it rapidly spreads to the whole system, and as a result the
all-C state changes to the all-D state. Conversely, we observe a
similar phenomenon if a cooperator appears on the largest-degree
hub in an environment of all-D members: the all-D state changes
to the all-C state shortly after the perturbation. Thus, the
system alternates between all-C and all-D.

In the second case, we apply the above perturbation to a
randomly selected node. From Fig. 4(b), we know that the
final interaction network has a degree distribution similar to
a power law. This means that most of the players have small
neighborhoods. Consequently, the disturbance is most likely
to affect nodes of low degree. In such a situation, a disturbance
through mutation (or mistakes) cannot spread to the whole
system. This is because the high-degree cooperators can protect
their neighbors from imitating defectors, like the behavior
seen in region I of the multiadaptive model without noise,
where $\rho$ approaches $\rho^*$ as $t$ increases [see Fig. 8(b)].

The last type of perturbation we consider is that each agent
has a probability $2 \times 10^{-4}$ to mutate per timestep,
regardless of payoffs. Figure 8(c) shows the time evolution of
$\rho$ when applying this type of perturbation to an all-C state.
The picture is similar to that of Fig. 8(b): the high-degree C
agents and their neighbors are not affected much, since it is
mainly the strategies of agents on low-degree nodes that are
mutated most of the time, and $\rho$ fluctuates around $\rho^*$ as
$t$ increases. Taken together, the all-C state needs to be
stabilized by hubs; once the hubs change their strategies by
mutation, the all-C state is no longer stable.
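
The three perturbation schemes of this subsection can be sketched as
follows. This is an illustrative Python fragment, not the simulation
code used for Fig. 8; the data layout (dictionaries of strategies and
degrees keyed by node), all function names, and the toy network are
assumptions, and the per-agent mutation probability is left as a free
parameter.

```python
import random


def flip(strategy):
    """Return the opposite strategy (C <-> D)."""
    return "D" if strategy == "C" else "C"


def flip_largest_hub(strategies, degrees):
    """Case 1: flip the strategy of the node with the largest degree."""
    hub = max(degrees, key=degrees.get)
    strategies[hub] = flip(strategies[hub])
    return hub


def flip_random_node(strategies, rng):
    """Case 2: flip the strategy of one uniformly chosen node."""
    node = rng.choice(sorted(strategies))
    strategies[node] = flip(strategies[node])
    return node


def mutate_all(strategies, p_mut, rng):
    """Case 3: every node flips independently with probability p_mut."""
    flipped = [v for v in sorted(strategies) if rng.random() < p_mut]
    for v in flipped:
        strategies[v] = flip(strategies[v])
    return flipped


if __name__ == "__main__":
    rng = random.Random(1)
    strategies = {v: "C" for v in range(6)}          # toy all-C state
    degrees = {0: 2, 1: 9, 2: 3, 3: 2, 4: 4, 5: 2}   # node 1 is the hub
    interval = 100  # perturb once every `interval` timesteps (cases 1 and 2)
    for t in range(1, 301):
        if t % interval == 0:
            flip_largest_hub(strategies, degrees)
    print(strategies)
```

In a full simulation these calls would be interleaved with the regular
payoff, strategy and rewiring updates; only the perturbation step is
shown here.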



V. TIME SCALES 

Up to now, we have only investigated the case where the dynamical
variables in our multiadaptive model evolve on the same
timescale for network updates, strategy updates and the feedback
from the payoff of agents to the payoff matrix ($b$, to be
specific). This similarity of timescales makes sense especially
in the context of evolution and population biology, where one
timestep of the simulation corresponds to one generation (so
time is naturally discrete). It could be appropriate in social
systems too, whenever the processes happen on similar
timescales. However, there are also socioeconomic situations
that would be better modeled as having different timescales
for different processes. A natural way of implementing this
is to use asynchronous updating where, at every timestep, one
randomly chosen agent changes its strategy. By this method,
one can easily tune the timescales of the processes. Another
option is to use partially synchronous updates, where one of
the steps is synchronous and the others asynchronous.

In this section, we consider distinct timescales for the
updating of agents' strategies, the evolution of the temptation
and the interaction structure. We first study the case of
random asynchronous updating. Here we go through all agents
in a random order (the order differs from timestep to
timestep). For each agent, we update its strategy and
immediately thereafter readjust $b$ and perform the rewiring
of the non-local link. We find that under the random
asynchronous updating scheme, the all-C state disappears from
region II, in contrast to the case of synchronous updating;
see Fig. 9(a). This effect of the random update is probably due
to the fact that agents now have more information, on average,
to guide their decisions about which strategy to use.

FIG. 8: (Color online) Three cases of strategy noise via mutation
(change from D to C or C to D). The time evolution of $\rho$ for
different types of perturbations with the parameters $\alpha = 4$,
$\beta = 1$, and $b_0 = 3.5$. The strategy of an agent placed on the
largest hub (a) or on a randomly selected node (b) is changed to the
opposite (flipped) at every time interval $\Delta t = 100$. In panel
(c), each agent has a probability $2 \times 10^{-4}$ per timestep to
mutate regardless of payoffs.

FIG. 9: (Color online) An investigation of the effect of relative
timescales. The average density of cooperators for random updating
(a), more frequent strategy updating ($\beta_1 = 1$, $\beta_2 = 0.01$)
(b), and more frequent link rewiring ($\beta_1 = 0.01$, $\beta_2 = 1$)
(c). We set the parameter $\alpha$ to 4.

Next, we tune the relative timescales of the network and strategy
updating. To do this, we separate the probabilities for updating
strategies and the social network: let $P_1(i \to j)$ represent
strategy updating and $P_2(i \to j)$ link rewiring, where

$$P_k(i \to j) = \frac{1}{1 + \exp[-\beta_k (u_j - u_i)]}, \quad k = 1, 2. \qquad (5)$$

The parameters $\beta_1$ and $\beta_2$ control the probabilities
(hence the speed, or timescale) of the evolution. Let us define
$\tau_1$ as the average time for one strategy update to occur. The
average number of strategy updates that have occurred until time
$t$, $n_1$, equals $t P_1$. The average time $\tau_1$ is inversely
proportional to $n_1$, so $\tau_1 \sim 1/n_1 \sim 1/P_1$. Using
Eq. (5), we get $\tau_1 \sim 1 + \exp(-\beta_1 \Delta u)$, where
$\Delta u = u_j - u_i$. Because we use the follow-the-best rule, the
neighbor's payoff is always larger than the agent's own,
$u_j > u_i$. In sum, the average time for one strategy update is a
decreasing function of $\beta_1$. By the same calculation, the
average time for one link rewiring is a decreasing function of
$\beta_2$. Thus, we can control the timescales of strategy updating
and link rewiring separately with $\beta_1$ and $\beta_2$.

Given the feedback strength $\alpha$, the system has an all-C state
if the inequality $\beta_1 > \beta_2$ is satisfied. Figures 9(b) and
9(c) show $\rho$ as a function of $b_0$ for two different sets of
parameters, $(\beta_1, \beta_2) = (1, 0.01)$ and $(0.01, 1)$,
respectively. When strategy updating is more frequent than link
rewiring (i.e., $\beta_1 > \beta_2$), we observe a similar
behavior of $\rho$ versus $b_0$ as in Fig. 3(a). On the other hand,
when link updating is more frequent than strategy updating,
the system never reaches the all-C boundary; see Fig. 9(c). In
region II, the system reaches either the $\rho = \rho^*$ or the
$\rho = 0$ state, and the probability of ending in the all-D state
increases with increasing $b_0$. The effect of more frequent link
updating is similar to that of the random dynamics in the sense that
the all-C state is lost. This suggests that the random dynamics
effectively slows down the updating of strategies. From this, we
learn that the existence of the all-C state requires a strategy
dynamics that is fast compared to the link dynamics.
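
A quick numerical check of the timescale argument, as a Python
sketch: assuming the acceptance probability has the Fermi form
$P = 1/(1 + \exp[-\beta \Delta u])$ and that one update is attempted
per timestep, the mean waiting time between accepted updates is
$1/P = 1 + \exp(-\beta \Delta u)$. The function name and the sample
payoff gap are illustrative choices, not values from the simulations.

```python
import math


def mean_waiting_time(beta, delta_u):
    """Mean number of timesteps per accepted update, 1/P = 1 + exp(-beta * delta_u)."""
    return 1.0 + math.exp(-beta * delta_u)


if __name__ == "__main__":
    delta_u = 2.0  # hypothetical positive payoff gap u_j - u_i
    for beta in (0.01, 0.1, 1.0, 10.0):
        print(f"beta={beta:5.2f}  tau={mean_waiting_time(beta, delta_u):.3f}")
```

For any positive payoff gap the waiting time decreases monotonically
with $\beta$, from 2 at $\beta = 0$ toward 1 for large $\beta$.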

VI. DISCUSSION

In this paper, we have studied a game-theoretical model
with feedback from the behavior of the agents to the rules of
the game, via the payoff matrix, and an active optimization of
both the contact structure between the agents and their
strategies. We investigate this model, first presented in
Ref. [1], by extending it in several ways. With respect to the
average cooperation density, the model is a nonequilibrium model
(in the statistical-mechanics sense). This makes the initial
temptation value $b_0$ a crucial model parameter. As in Ref. [1],
we identify three regions of distinct dynamic behavior for a large
part of parameter space and for different generalizations of the
model. In region I, the average cooperator density relaxes to a
stable level through damped oscillations; in region III the
system reaches an all-defect state. For intermediate $b_0$-values
(region II), the system ends at one of three fixed points, 0,
$\rho^*$ or 1, with parameter-dependent probabilities. For some
parameter values in this region, the original multiadaptive model
will almost certainly reach an all-cooperator state. This
all-cooperator state is absorbing, but when we extend the model
by adding noise, this state rarely appears. More precisely, if
the hubs of the network can mutate their strategies, the all-C
state will not be stable: the all-C state needs to be stabilized
by cooperator hubs. When we tune the timescales between
link and strategy updates, we find that the all-C state needs
a faster strategy update; if the link dynamics is too frequent,
then the all-C state is unstable. An interesting aspect of the
all-C state is that it has a power-law-like degree distribution
with a $C \sim 1/k$ scaling of the clustering coefficient (a
hallmark of hierarchical organization [16]). Traditionally,
hierarchies are explained as consequences of factors external to
the social system, e.g. age or fitness [19].

We use several different updating rules: random updating
with following the best [Fig. 9(a)] and an updating rule with
mutation probability [Fig. 8(c)]. Additionally, we investigate
the model with a different updating rule, in which each agent
chooses a random neighbor and imitates its strategy with
the probability of Eq. (3). The system does not go to the all-C
state anymore (not shown). We think this result is related to the
timescale of link rewiring. Under this updating rule, an agent
needs more time to find the most profitable neighbor with the
non-local link. This means that link rewiring is effectively
slower than strategy updating. However, since this timescale
difference is not explicit, the average cooperation reaches a high
level (but not all-C) in some range of $b_0$.

In the case of $\alpha < 0$, the result is easy to anticipate.
There is only unidirectional feedback, accelerating either
cooperation or defection. Since the temptation to defect at the
initial time ($t = 0$) is larger than 1 and the temptation is
increasing for $\rho_C < \rho^*$, agents prefer to act as defectors
and $b$ is always increasing. Finally, the system always goes to
the all-D state. On the other hand, one can think of a quite narrow
region in the $\alpha < 0$ case where this does not hold. Suppose
that $b_0$ is small (for example, $b_0 = 1.01$),
$\rho_C(0) \gg \rho_D(0)$ (for example, $\rho_C(0) = 0.99$), and
$\alpha$ is strongly negative (e.g., $\alpha = -4$). In this
setting, the system reaches the all-C state.

To summarize, our work shows a generalization of spatial
social dilemma models where hierarchies can emerge in a
cooperative state. In our framework, these hierarchies need
stable cooperating hubs to persist. In this sense, the hierarchies
are more the result of an all-cooperative state than a
prerequisite for its emergence. We note that in the literature,
there are conflicting results on whether or not hierarchy promotes
cooperation [20]; in different games the effect can be
different.

We believe there are many interesting multiadaptive directions
to explore for Nowak-May type spatial games. In this work the
interaction network and the payoff matrix are controlled by the
game; one can also imagine situations where the dynamics (the
timescales) is an outcome of the game dynamics and the agents
are more heterogeneous (so their payoff can be reinvested into
their ability to play the game).

Appendix: Algorithmic description

In this appendix we give a more detailed description of the
multiadaptive model of Section II. The initial state of the
model is generated as follows.

1. Construct an $L \times L$ square lattice with closed boundary
conditions (so the corner vertices interact with two others
and the edge vertices interact with three others).

2. Connect every vertex with a random other vertex.

3. Assign strategies C or D randomly to all vertices.

4. Set $b(t = 0) = b_0$.
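
The four initialization steps above can be sketched in Python as
follows. The data layout (lists indexed by vertex number), the
function name, the equal probability of C and D in step 3, and the
choice to draw each long-range partner uniformly among all other
vertices are assumptions of this sketch rather than details given in
the text.

```python
import random


def initial_state(L, b0, rng):
    """Build lattice links, long-range links, strategies and b(0); requires L >= 2."""
    n = L * L
    lattice = [[] for _ in range(n)]
    # Step 1: L x L square lattice with closed (non-periodic) boundaries.
    for x in range(L):
        for y in range(L):
            v = x * L + y
            if x + 1 < L:                      # link to the vertex in the next row
                lattice[v].append(v + L)
                lattice[v + L].append(v)
            if y + 1 < L:                      # link to the vertex in the next column
                lattice[v].append(v + 1)
                lattice[v + 1].append(v)
    # Step 2: one long-range link per vertex to a random other vertex.
    long_range = []
    for v in range(n):
        w = rng.randrange(n - 1)
        long_range.append(w if w < v else w + 1)   # shift skips v itself
    # Step 3: random strategies (assumed here: C and D equally likely).
    strategies = [rng.choice("CD") for _ in range(n)]
    # Step 4: the initial temptation is simply b0.
    return lattice, long_range, strategies, b0


if __name__ == "__main__":
    lattice, long_range, strategies, b = initial_state(4, 3.5, random.Random(0))
    # Number of lattice neighbors of a corner, an edge and an interior vertex.
    print(len(lattice[0]), len(lattice[1]), len(lattice[5]))
```

With closed boundaries the lattice degree is 2 at corners, 3 along
edges and 4 in the interior, matching the description in step 1.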

Then, from the initial state, the system is updated by the steps
listed below. We describe the transition from timestep
$t$ to $t + 1$.

1. For every vertex (in arbitrary order), calculate the
payoff from the interactions with its neighbors by the
Nowak-May rules: an interaction between a D and a C
contributes $b(t)$ to the score of the defecting vertex
and 0 to the cooperator, an interaction between two C
gives 1 to both, while two D give no profit.

2. Go through all vertices and let them copy the strategy of
the neighbor (including themselves) that had the largest
payoff in the previous step; also rewire the long-range
link to the neighbor (excluding themselves) that has the
largest payoff.

3. Calculate $b(t + 1)$ by Eq. (1).